NON-PROVISIONAL APPLICATION FOR UNITED STATES PATENT 



FOR 



DUTY CYCLE COMPENSATION IN CLOCK CIRCUIT 



Inventors 
Darren Slawecki 



Prepared by: William A. Newton, Reg. No. 28101 
Schwabe, Williamson & Wyatt, PC 
Pacwest Center 

1211 SW Fifth Ave., Ste 1600-1900 
Portland, Oregon 97204 
wnewton@schwabe.com 



Attorney Docket No.: 110348-133034 
Intel Tracking No: P16695 



Express Mail Label No. EL973636906US 
Filing Date: September 30, 2003 






BACKGROUND OF THE INVENTION 

1. Field of the Invention 

[0001] The present invention relates to, but is not limited to, electronic devices, and in 
particular, to the field of clock circuits. 

2. Description of Related Art 

[0002] Clock signals are basic elements in digital circuits. A clock signal, generated by 
a clock generator, may be used to trigger flip-flops, serve as a timing reference, provide 
data and address strobing, and perform many other timing and control functions. To 
distribute the clock signal to various circuit elements, a clock distribution circuit is used. 

[0003] The clock pulse signal has a frequency and a duty cycle. The duty cycle is 
defined as the ratio of the high period to the entire period of the signal. The 
ideal duty cycle for a clock signal is 50%. The reason clocks become unbalanced, 
drifting away from the 50% duty cycle, is that a digital logic element may have an 
asymmetric response to rising and falling waveforms, so that the propagation delay for 
the logic element differs for rising and falling clock edges. The clock signal propagating 
through the logic element is either shortened or lengthened by this difference in 
propagation delay. 
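
By way of illustration only, the relationship between asymmetric propagation delay and duty cycle may be sketched in Python as follows. The function names and the delay values (30 ps and 20 ps) are assumptions chosen for the example and do not appear in the figures.

```python
def duty_cycle(high_time_ps, period_ps):
    """Return the duty cycle: the high period divided by the entire period."""
    return high_time_ps / period_ps


def high_time_after_element(high_time_ps, rise_delay_ps, fall_delay_ps):
    """High period at the output of a non-inverting logic element.

    The rising edge is delayed by rise_delay_ps and the falling edge by
    fall_delay_ps, so the pulse changes width by the difference between them.
    """
    return high_time_ps + fall_delay_ps - rise_delay_ps


# Hypothetical numbers: a 4 GHz clock (250 ps period) with a 50% duty cycle.
period_ps = 250.0
high_in_ps = 125.0
high_out_ps = high_time_after_element(high_in_ps, rise_delay_ps=30.0, fall_delay_ps=20.0)

print(duty_cycle(high_in_ps, period_ps))   # 0.5
print(duty_cycle(high_out_ps, period_ps))  # 0.46
```

In this sketch the rising edge is delayed 10 ps more than the falling edge, so the high period is shortened by 10 ps and the duty cycle drifts from 50% to 46%.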

[0004] Automatic test equipment (ATE) is often used to test and debug critical speed 
paths on newly designed microprocessors. The ATE is connected to the microprocessor 
to control a clock shrink circuit, which generates a test clock used to drive one or more 
functional units contained therein. The functional units include, for example, the data 
path, input units, execution units, cache, output units, and the like. The clock shrink 
circuit uses a technique called "clock shrinking", by which the frequency of a clock (or 
group of clocks) is changed dynamically during the execution of a microprocessor. The 
term "shrinking" is used to denote that the period of a clock cycle of interest is 
reduced relative to other clock cycles. Clock shrinking is a debug tool for testing newly 
designed microprocessors and other types of integrated circuits. By shrinking a single 
clock (and leaving the other clocks at a lower, passing frequency), a single critical path 
can be isolated in a test or diagnostic that contains many critical speed paths. 
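
As a simplified sketch of the clock shrinking technique, the following Python fragment builds a list of clock periods in which a single cycle is shortened while the remaining cycles stay at a slower, passing frequency. The function names, the 400 ps nominal period, and the 80 ps shrink amount are hypothetical values used only for illustration.

```python
def shrink_schedule(num_cycles, nominal_period_ps, shrunk_cycle, shrink_ps):
    """Return the period of each clock cycle, with one cycle shortened."""
    periods = [nominal_period_ps] * num_cycles
    periods[shrunk_cycle] = nominal_period_ps - shrink_ps
    return periods


def frequency_ghz(period_ps):
    """Convert a period in picoseconds to a frequency in gigahertz."""
    return 1000.0 / period_ps


# Hypothetical test: six cycles at 400 ps (2.5 GHz), with cycle 3 shrunk by 80 ps.
periods = shrink_schedule(6, 400.0, shrunk_cycle=3, shrink_ps=80.0)
print(periods)                    # [400.0, 400.0, 400.0, 320.0, 400.0, 400.0]
print(frequency_ghz(periods[3]))  # 3.125
```

Only the logic exercised during the shortened cycle sees the higher effective frequency, which is how a single critical speed path may be isolated.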

[0005] FIG. 1 illustrates a prior art, on-chip clock shrink circuit 10 for shifting the phase 
of a clock signal (CLOCK). The clock shrink circuit 10 includes two identical circuit 
portions, a rise mirror circuit 12 and a fall mirror circuit 14. Each mirror circuit 12 and 
14 includes the same components, with the mirror circuit 12 having a front end inverter 
16 with an output signal CLOCKB, an inverting variable pull-up delay stage 18 and an 
inverting output stage 20 with an output signal CLOCKMID. The fall mirror circuit 14 
is shown with an output signal CLOCKOUT. 

[0006] Referring to FIG. 2, the operation of the clock shrink circuit 10 of the prior art is 
shown for an illustrative regular frequency (generally below 4 GHz) by showing in a 
timing diagram the signals CLOCK, CLOCKB, CLOCKMID, and CLOCKOUT. The 
signal CLOCKB is an inverted, delayed version of the CLOCK signal, but retains the 
50% duty cycle. The delay stage 18 of the rise mirror circuit 12 creates significantly 
different rise and fall propagation delays as illustrated by the CLOCKMID signal, with 
the rising input to falling output being much larger than the falling input to rising output. 
As such, the output duty cycle is significantly different than the input duty cycle, e.g., 
50% input duty cycle results in a greater than 70% duty cycle output. In other words, 
the CLOCKMID waveform is more high than low during this regular frequency 
operation. With the assistance of the fall mirror circuit 14, the CLOCKOUT signal is the 
desired delayed clock signal with a 50% duty cycle. 

[0007] Referring to FIG. 3, the operation of the clock shrink circuit 10 is shown for a 
high frequency (generally above 4 GHz) by again showing in a timing diagram the 
signals CLOCK, CLOCKB, CLOCKMID, and CLOCKOUT. As the clock frequency is 
increased, eventually the shrink circuit 10 becomes a frequency limiter. The 
CLOCKOUT signal no longer toggles in the high frequency operation. 
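
A first-order way to picture this limit is sketched below in Python. The sketch assumes that the uncompensated delay stage widens every high pulse of the intermediate signal by a fixed amount, so that the low pulse vanishes once the widening reaches half a period. This is an edge-timing approximation only; it does not model the analog bandwidth of the circuit, and the 110 ps skew value is an assumption, not a figure taken from the drawings.

```python
def stretched_duty_cycle(freq_ghz, skew_ps):
    """Duty cycle of a 50% clock after its high pulse is widened by skew_ps.

    Returns None when the widened high pulse fills the whole period, that is,
    the low pulse is swallowed and the signal no longer toggles.
    """
    period_ps = 1000.0 / freq_ghz
    high_ps = period_ps / 2.0 + skew_ps
    if high_ps >= period_ps:
        return None
    return high_ps / period_ps


SKEW_PS = 110.0  # hypothetical rise/fall delay difference of the delay stage

print(stretched_duty_cycle(2.0, SKEW_PS))  # 0.72
print(stretched_duty_cycle(5.0, SKEW_PS))  # None
```

With these assumed values, a 2 GHz clock emerges with a duty cycle above 70%, while a 5 GHz clock (100 ps low period) cannot absorb the 110 ps widening and stops toggling.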

[0008] Although the shrink circuit 10 is generally acceptable for frequencies 
approximately under 4 GHz, the clock shrink circuit 10 has insufficient bandwidth of 
operation as a serial circuit in the clock distribution path having frequencies 
approximately greater than 4 GHz. Currently, the maximum frequency of operation of 
the shrink circuit 10 is close to the nominal part frequency. This limits the maximum 
device frequency as the part speed is increased by fixing paths in the design. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0009] FIG. 1 is a schematic diagram of a prior art clock shrink circuit. 

[0010] FIG. 2 is a timing diagram showing the prior art clock shrink circuit operating 
with a regular frequency. 

[0011] FIG. 3 is a timing diagram showing the prior art clock shrink circuit operating at a 
high frequency. 

[0012] FIG. 4 is a schematic diagram of a clock shrink circuit according to one 
embodiment of the invention. 

[0013] FIG. 5 is a timing diagram for the embodiment of the clock shrink circuit shown 
in FIG. 4 operating at a regular frequency. 

[0014] FIG. 6 is a timing diagram for the embodiment of the clock shrink circuit shown 
in FIG. 4 operating at a high frequency. 

[0015] FIG. 7 is a system showing one embodiment of the clock shrink circuit operating 
with automatic test equipment. 






DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS 

[0016] In the following description, for purposes of explanation, numerous details are 
set forth in order to provide a thorough understanding of the disclosed embodiments of 
the present invention. However, it will be apparent to one skilled in the art that these 
specific details are not required in order to practice the disclosed embodiments of the 
present invention. In other instances, well-known electrical structures and circuits are 
shown in block diagram form in order not to obscure the disclosed embodiments of the 
present invention. 

[0017] FIG. 4 illustrates one embodiment of an on-chip clock shrink circuit 30 for 
delaying a clock signal (CLOCK). The clock shrink circuit 30 includes two identical 
circuit portions, a rise mirror circuit 32 and a fall mirror circuit 34. Each mirror circuit 
32 and 34 includes the same components. The clock signal, which is generated by a 
clock generator (not depicted), is provided as an input to the rise mirror circuit 32. The 
clock generator uses a phase-locked loop ("PLL") to generate the clock with a 50% duty 
cycle. The rise mirror circuit 32 has an output signal (CLOCKMID), which is coupled 
to an input of the fall mirror circuit 34. At its output, the fall mirror circuit 34 generates 
an output signal (CLOCKOUT). 

[0018] Each of the rise and fall mirror circuits 32 and 34 includes the identical 
components, which are designated by the same reference numbers. The rise circuit 32 
includes an inverting first matching stage 36 for providing a first inverted signal 
(CLOCKB), an inverting first pull-up stage 38 with a variable phase delay for providing 
a second inverted signal and an inverting first output stage 40 for providing the third 
inverted signal CLOCKMID. In other words, all three stages 36, 38 and 40 are inverting 
stages with inverted output signals. 

[0019] The first matching stage 36 includes a non-inverting logic element 42 and a NOR 
gate 44 with one input connected to the input of the non-inverting logic element 42 and 
the other input coupled to the output of the non-inverting logic element 42. 

[0020] Likewise, the fall mirror circuit 34 includes an inverting second matching stage 
36 to provide a fourth inverted signal, an inverting second pull-up stage 38 with a 
variable phase delay to provide a fifth inverted signal and an inverting second output 
stage 40 to provide a sixth inverted signal, the output clock signal (CLOCKOUT), with 
such output clock signal having the desired phase delay. Since the second stages 36, 38, 
and 40 are identical in structure to first stages 36, 38, and 40, they will not be described 
further. 

[0021] In this illustrative embodiment, each of non-inverting logic elements 42 of the 
rise and fall mirror circuits 32 and 34 includes at least a pair of cascaded inverters 46. 
Although two inverters 46 are shown in FIG. 4, any even number of cascaded inverters 
46 may be used to match the input and output duty cycles of the rise and fall mirror 
circuits 32 and 34, as will be described hereinafter. Other combinations of logic 
elements may provide the function of the first matching stage 36. For example, although 
two inverters 46 are shown, any non-inverting logic element may be used. For the 
matching stage 36, other logic configurations that delay one edge more than another may 
be substituted. 

[0022] Referring to the timing diagram of FIG. 5, the operation of the one embodiment 
of the clock shrink circuit 30 is shown wherein the input clock signal (CLOCK) has an 
illustrative regular frequency (generally below 4 GHz) and a 50% duty cycle. In the rise 
mirror circuit 32, a first pulse width change is introduced into the first inverted output 
signal (CLOCKB) of the first matching stage 36. More specifically, the high periods are 
shortened and low periods are lengthened for the first inverted output signal, causing the 
duty cycle to drop below the 50% duty cycle. This is caused by the propagation delay of 
the falling edge of the CLOCK signal through the first matching stage 36 exceeding the 
propagation delay of the rising edge of the CLOCK signal. Since the first matching stage 
36 also provides an inverter function, this leads to the falling edge of signal CLOCKB 
being delayed less than the rising edge of signal CLOCKB. As will be described 
hereinafter, the intentional introduction of asymmetric propagation delays (different rise 
and fall propagation delays) for the first matching stage 36 is intended to cancel the 
nominal pulse width distortion introduced by the first pull-up stage 38. 

[0023] The first pull-up stage 38 produces a second inverted signal with a pulse width 
change (distortion) due to the introduction of asymmetric propagation delays. The 
second inverted signal is then inverted by the output stage 40 without significant 
asymmetric propagation delays as shown by third inverted signal (CLOCKMID). The 
second inverted signal is not shown in FIG. 5, since the CLOCKMID signal fully shows 
the pulse width distortion introduced by the asymmetric propagation delays of the 
pull-up stage 38. The signal on the node after the CLOCKB signal and before the CLOCKMID 
signal is the same as the CLOCKMID signal, only inverted. In other words, it has been 
corrected to be back to a 50% duty cycle. The signal CLOCKMID has returned to an 
approximately 50% duty cycle, with the pulse width distortion introduced by the first 
matching stage 36 canceling out or compensating for the pulse width distortion 
introduced by the first pull-up stage 38. 

[0024] In summary, it should be noted that the second inverted signal and the third 
inverted signal CLOCKMID have returned to an approximately 50% duty cycle of the 
input clock signal CLOCK. This is because the first matching stage 36 introduces pulse 
width compression (duty cycle offset) by having different rise and fall propagation 
delays through the first matching stage 36 which cancels out the pulse width expansion 
of the pull-up stage 38. The amount of pulse width compression of the first matching 
stage 36 substantially equals the amount of pulse width expansion of the first pull-up 
stage 38. 

[0025] With respect to the rise mirror circuit 32, the first matching stage 36 introduces a 
first pulse width change deviating from a 50% duty cycle to cancel out a second pulse 
width change (distortion) caused by the first pull-up stage 38 to achieve a substantially 
50% duty cycle. In other words, the first matching stage 36 creates a first matching 
delay (difference between the rise and fall propagation delays of the first matching stage 
36) to compensate for or cancel a first pull-up delay (difference between the rise and fall 
propagation delays of the first pull-up stage 38). The first matching and first pull-up 
delays have substantially equal but opposite effects on the pulse widths and therefore the 
duty cycle. It should also be noted that pulse width compression is the same as pulse 
width expansion of the inverted signal. 
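
The cancellation described for the rise mirror circuit 32 may be illustrated with the following Python sketch, which tracks the edges of one high pulse through three inverting stages. The model is an assumption made for illustration: each stage is reduced to two delays (one for an input rising edge, one for an input falling edge), the output stage 40 is treated as symmetric, and the 10 ps base delay and 40 ps skews are hypothetical values.

```python
def inverting_stage(pulse, period_ps, delay_in_rise_ps, delay_in_fall_ps):
    """Propagate one high pulse of a periodic signal through an inverting stage.

    pulse is (rise_time, fall_time) of the input high pulse. An input rising
    edge produces an output falling edge after delay_in_rise_ps; an input
    falling edge produces an output rising edge after delay_in_fall_ps.
    """
    in_rise, in_fall = pulse
    out_rise = in_fall + delay_in_fall_ps
    out_fall = in_rise + period_ps + delay_in_rise_ps  # driven by the next input rise
    return (out_rise, out_fall)


def duty_cycle(pulse, period_ps):
    rise, fall = pulse
    return (fall - rise) / period_ps


def rise_mirror(clock, period_ps, matching_skew_ps, pull_up_skew_ps, base_ps=10.0):
    """Return (CLOCKB, CLOCKMID) after stages 36, 38 and 40 of the sketch."""
    # Matching stage 36: the falling CLOCK edge is delayed more than the rising edge.
    clockb = inverting_stage(clock, period_ps, base_ps, base_ps + matching_skew_ps)
    # Pull-up stage 38: the rising (pull-up) output transition is the slow one.
    second = inverting_stage(clockb, period_ps, base_ps, base_ps + pull_up_skew_ps)
    # Output stage 40: assumed to have equal rise and fall delays.
    clockmid = inverting_stage(second, period_ps, base_ps, base_ps)
    return clockb, clockmid


PERIOD_PS = 250.0        # hypothetical 4 GHz clock
CLOCK = (0.0, 125.0)     # 50% duty cycle input

clockb, clockmid = rise_mirror(CLOCK, PERIOD_PS, matching_skew_ps=40.0, pull_up_skew_ps=40.0)
print(duty_cycle(clockb, PERIOD_PS), duty_cycle(clockmid, PERIOD_PS))  # 0.34 0.5

clockb, clockmid = rise_mirror(CLOCK, PERIOD_PS, matching_skew_ps=0.0, pull_up_skew_ps=40.0)
print(duty_cycle(clockb, PERIOD_PS), duty_cycle(clockmid, PERIOD_PS))  # 0.5 0.66
```

When the matching skew equals the pull-up skew, CLOCKB carries a deliberately reduced duty cycle and CLOCKMID returns to 50%. With the matching skew set to zero, CLOCKB stays at 50% and CLOCKMID is more high than low.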

[0026] The same process occurs in the fall mirror circuit 34 as occurs in the rise mirror 
circuit 32, except the fall mirror circuit 34 starts with the third inverted signal 
CLOCKMID, whereas the rise mirror circuit 32 starts with the input clock signal 
CLOCK. As previously described for the fall mirror circuit 34, the pulse width 
distortion by the second pull-up stage 38 is cancelled by the previously introduced pulse 
width change (duty cycle offset) of the second matching stage 36. Consequently, the 
desired CLOCKOUT signal, with an approximately 50% duty cycle, is generated at the 
output of the fall mirror circuit 34, such signal having been delayed (phase shifted) 
relative to the input CLOCK signal by both pull-up stages 38. High frequency operation 
of the clock shrink circuit 30 is assured once the rising output and falling output delays 
are matched. 

[0027] With respect to the fall mirror circuit 34, the second matching stage 36 
introduces a third pulse width change deviating from a 50% duty cycle to cancel out a 
fourth pulse width change (distortion) caused by the second pull-up stage 38, resulting in 
a substantially 50% duty cycle. In other words, the second matching stage 36 creates a 
second matching delay (difference between the rise and fall propagation delays of the 
second matching stage 36) to compensate for or cancel a second pull-up delay (difference 
between the rise and fall propagation delays of the second pull-up stage 38). The second 
matching and second pull-up delays have substantially equal but opposite effects on the 
pulse widths and therefore the duty cycle. 




[0028] Referring to FIG. 6, the operation of the clock shrink circuit 30 is shown for a 
high frequency (generally above 4 GHz) by again showing in a timing diagram the 
signals CLOCK, CLOCKB, CLOCKMID, and CLOCKOUT. As the frequency of the 
CLOCK signal is increased, both the waveforms CLOCKMID and CLOCKOUT 
maintain an approximately 50% duty cycle. If the rising and falling delays had not been 
balanced, the clock shrink circuit 30 would limit the maximum device frequency. The 
embodiment of the clock shrink circuit 30 of FIG. 4 fixes the duty cycle problem by 
allowing for greatly increased circuit bandwidth without significant increases in phase 
jitter. The shrink circuit 30 improves on the external duty cycles (signals CLOCKMID 
and CLOCKOUT) by creating an internal signal that has a non-ideal duty cycle 
(CLOCKB signal). Since the CLOCKB node is capacitively loaded significantly 
less than other nodes in the rise mirror circuit 32 (also the case for the fall 
mirror circuit 34), it can be easily designed to operate with the duty cycle offset of the 
first matching stage 36; hence, the clock shrink circuit 30 does not limit the maximum 
device operation. The embodiment of FIG. 4 does not have the prior art shrink circuit's 
disadvantages of having higher jitter (which equates to lower device frequency of 
operation) and the upper frequency limit. 

[0029] FIG. 7 (divided over FIGS. 7A and 7B) is a schematic diagram of the pull-up 
stages 38 of the rise and fall mirror circuits, with such stages 38 being coupled to an 
automated test equipment (ATE) and control circuitry 50. However, the pull-up stage 
38 and ATE and control circuitry 50 are the same as found in the prior art design of FIG. 
1 and therefore will not be described in detail. The ATE is indirectly coupled to the SEL 
input and a gate of the first transistor 62 of each of the unitcells 60. More specifically, 
the ATE does not directly drive the SEL input and gates, but drives the SEL input and 
the gates through the control circuitry, which is known logic, to provide known control 
signals ctrl0, ctrl1, ctrl2, and ctrl3 (not shown). Also, FIG. 7 shows the detailed 
schematics of the matching stage 36 and the output stage 40 shown in FIG. 4, in addition 
to the pull-up stage 38. In FIG. 4 the pull-up stages 38 in the rise mirror circuit 32 and 
the fall mirror circuit 34 are shown simplified as a single variable rise delay inverter. 
Based upon the control signals provided by the ATE and control circuitry 50, the pull-up 
stages 38 are set to introduce a phase delay into the CLOCKMID signal or the 
CLOCKOUT signal, depending upon which stage 38 is being considered, i.e., the one in 
the rise mirror circuit 32 or the one in the fall mirror circuit 34 respectively (see FIG. 4). 
The phase delay of each delay stage 38 is initially set in the middle range of the pull-up 
stages 38 and the ATE and control circuitry 50 may increase or decrease the amount of 
this phase delay. The more phase delay introduced, the more the output signals 
(CLOCKMID and CLOCKOUT) are delayed per edge respectively. In other words, the 
rise mirror circuit 32 only adds or subtracts delay to the rise edge of the CLOCKOUT 
signal (which is the fall edge of CLOCKMID signal) and the fall mirror circuit 34 only 
adds or subtracts delay to the fall edge of CLOCKOUT signal. As the ATE and control 
circuitry 50 increases the phase delay of the pull-up stages 38, additional inverters 
generally are not required at the input of the NOR gate 44. In other words, once the 
difference in rising and falling propagation delays is corrected (compensated for) by 
using either 2, 4, 6, etc., inverters 46, the resulting duty cycle offset does not need to be 
changed with the introduction by the ATE and control circuitry 50 of differing amounts 
of phase shift for the CLOCKOUT signal. Note that the phase shifts of the first and 
second pull-up stages 38 are accumulative, but impact different edges. 
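
The per-edge behavior may be pictured with the short Python sketch below. It assumes, for illustration only, a fixed nominal latency through the circuit plus an adjustable amount per mirror circuit; the function name, parameter names, and numeric values are hypothetical.

```python
def clockout_edges(clock_rise_ps, clock_fall_ps, base_delay_ps,
                   rise_mirror_adjust_ps, fall_mirror_adjust_ps):
    """Return (rise, fall) times of one CLOCKOUT high pulse.

    The rise mirror setting moves only the rising edge of CLOCKOUT and the
    fall mirror setting moves only the falling edge.
    """
    out_rise = clock_rise_ps + base_delay_ps + rise_mirror_adjust_ps
    out_fall = clock_fall_ps + base_delay_ps + fall_mirror_adjust_ps
    return out_rise, out_fall


# Hypothetical: 250 ps period, 50% duty cycle, 100 ps nominal latency.
rise, fall = clockout_edges(0.0, 125.0, 100.0,
                            rise_mirror_adjust_ps=20.0, fall_mirror_adjust_ps=0.0)
print(rise, fall, fall - rise)  # 120.0 225.0 105.0
```

In this example, adding 20 ps in the rise mirror circuit alone moves the rising edge of CLOCKOUT later and leaves the falling edge in place, shortening that high phase from 125 ps to 105 ps.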

[0030] The ATE of the ATE and control circuitry 50 is basically a sequence engine with 
memory to drive the stimulus and compare the results; hence, it may be described by the 
term "stimulus response machine". In this particular embodiment, the ATE comprises 
either an IMS Vanguard DM1000 tester (IMS is now owned by Credence) or a 
Schlumberger S9000 series tester. Optional equipment may be used, such as a built-in 
scope and a PMU (parametric measurement unit). 

[0031] Referring to FIG. 7, the matching stage 36 includes the same components 
previously described with respect to FIG. 4: the non-inverting logic element 42 having at 
least the pair of inverters 46 and the NOR gate 44. The NOR gate 44 includes p-channel 
transistors P1a and P1b, which are in series and connected between the supply voltage 
VCC and the node 58, and n-channel transistors N1a and N1b, which are connected in 
parallel between the node 58 and ground. The matching stage 36 is shown with the input 
signal CLOCK as is the case when the matching stage 36 is used in the rise mirror circuit 
32; however, when the matching stage 36 is used in the fall mirror circuit 34 the input 
signal is CLOCKMID. The signal on the node 58 is the CLOCKB signal, as previously 
shown in FIG. 4. 

[0032] The pull-up stage 38 includes a plurality of unitcells 60 coupled to the supply 
voltage VCC, with each unitcell 60 including a first p-channel transistor 62. All COM 
outputs are connected to the same node, namely node 66. Referring to FIG. 8, the 
unitcell 60 is illustrated in more detail and includes a second p-channel transistor 64. 
The unitcells 60 are arranged in a binary weighted scheme. Each set is connected to an 
individual signal driven by the ATE and control circuitry 50 (the previously mentioned 
control signals ctrl0-3) and is double the size of the previous set. For instance, the unitcell 60 
on the right hand side is a single instance. To the left of it, the first unitcell 60 connected 
to the ATE 50 is 2 instances, then 4 instances, then 8 instances and finally 16 instances 
on the left hand side. Through the use of the digital input controls from the ATE and 
control circuitry 50, the bias voltage is precisely set for a gate of a p-channel transistor 
p2. By the unitcells 60 setting a bias voltage level, they control the delay through the 
pull-up stage 38. 
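
The binary weighted arrangement may be sketched in Python as follows. Several points are assumptions made only for the example and are not stated in the description: that the single-instance unitcell is always active, that ctrl0 through ctrl3 drive the sets of 2, 4, 8 and 16 instances in that order, and that a control value of 1 enables a set. The mapping from the number of active unitcells to the resulting bias voltage is not modeled.

```python
SET_SIZES = (2, 4, 8, 16)  # instances per set, assumed to be driven by ctrl0..ctrl3
ALWAYS_ACTIVE = 1          # the single-instance unitcell (assumed always active)


def active_unit_cells(ctrl0, ctrl1, ctrl2, ctrl3):
    """Return how many unitcell instances are active for the given control bits."""
    bits = (ctrl0, ctrl1, ctrl2, ctrl3)
    return ALWAYS_ACTIVE + sum(size for size, bit in zip(SET_SIZES, bits) if bit)


print(active_unit_cells(0, 0, 0, 0))  # 1
print(active_unit_cells(1, 0, 0, 1))  # 19
print(active_unit_cells(1, 1, 1, 1))  # 31
```

Because each set is double the previous one, the four control bits select any odd count from 1 to 31 in steps of two instances under these assumptions.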

[0033] Referring back to FIG. 7, the COM output of the last unitcell 60, the gate of the 
transistor p2, a drain of a p-channel transistor p3, a drain of an n-channel transistor n2, 
and an n-channel transistor n4 are commonly coupled to a node 66. The drain of the 
transistor p2 is connected to a node 68, the source of transistor p3 is connected to the 
supply voltage Vcc and a source of transistor n2 is coupled to ground. A transistor p4 
has its source coupled to the supply voltage Vcc and its drain coupled to the node 68. 
The transistor p5 has its source coupled to the supply voltage Vcc and its drain 
commonly coupled with the transistor n4 and the gate of transistor p4. A transistor p6 
has its source coupled to the supply voltage Vcc and its drain coupled to the node 68 and 
an n-channel transistor n5 in the output stage 40. An n-channel transistor n3 has its 
drain connected to the node 68 and its source to ground. An inverter 70 has its input 
connected to a KILL signal and its output coupled to the gates of transistors n2, p6, and 
p3. The gates of transistors n4 and p5 are connected to a FINSEL signal. The pull-up 
stage 38 has input signals to control the steps and a range select as well (coarse or fine), 
which modulates the step sizes (for example, 60ps range for coarse versus 40ps range for 
fine) via the FINSEL signal. There is another feature of the pull-up stage 38 that allows 
the circuit to ignore an input transition so as to keep the output signal constant via the 
KILL signal. The most common application for the KILL signal is to remove a clock 
from a series, commonly known as "kill-a-clock". 
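
The effect of "kill-a-clock" on a series of clocks may be sketched in Python as shown below. The function name and the numeric values are hypothetical; the sketch lists only the rising-edge times of the surviving clocks and does not model the KILL signal path itself.

```python
def kill_a_clock(num_cycles, period_ps, killed_cycle):
    """Return the rising-edge times of a clock series with one cycle removed."""
    return [n * period_ps for n in range(num_cycles) if n != killed_cycle]


# Hypothetical: five cycles at a 250 ps period, with cycle 2 removed.
edges = kill_a_clock(5, 250.0, killed_cycle=2)
print(edges)  # [0.0, 250.0, 750.0, 1000.0]
```

The output stays constant across the removed cycle, so the gap between the second and third surviving edges is two periods.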

[0034] The output stage 40 includes a p-channel transistor p7 and the n-channel 
transistor n5. The transistor p7 has its source connected to the voltage supply Vcc, its 
drain connected to an output node 72, and its gate commonly connected to the drain of 
the transistor p6, the gate of the transistor n5, and the node 68. The transistor n5 has its 
drain connected to the output node 72 and its source connected to ground. The output 
node 72 provides either the CLOCKMID or CLOCKOUT signal, depending upon 
whether the output stage 40 is located in the rise mirror circuit 32 or the fall mirror 
circuit 34. 

[0035] Although specific embodiments have been illustrated and described herein, it 
will be appreciated by those of ordinary skill in the art that any arrangement which is 
calculated to achieve the same purpose may be substituted for the specific embodiment 
shown. This application is intended to cover any adaptations or variations of the 
embodiments of the present invention. Therefore, it is manifestly intended that this 
invention be limited only by the claims. 






